A maximally decimated (MD) structure is presented for adaptive filtering, based on perfect reconstruction (PR) transmultiplexers (TMUX) and a system representation derived from their dual filter banks (FB). The structure is computationally very efficient, adapting at the low data rate a number of coefficients that exceeds the length of the unknown system by only a small amount. For colored inputs...
Combining spatial and rank (SR) order information into filtering methods has been exploited in recent nonlinear filtering algorithms. Fuzzy ordering theory incorporates sample spread information into the SR ranking framework and develops the concepts of fuzzy ranking and fuzzy order statistic. In this paper, fuzzy concepts are utilized to generalize weighted median filters and optimization of the...
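The fuzzy generalization itself is beyond the scope of this abstract, but the crisp weighted median filter that it extends can be sketched in a few lines of Python (function names and the example weights are illustrative, not from the paper):

```python
def weighted_median(samples, weights):
    """Crisp weighted median: each sample is replicated by its
    (positive integer) weight and the median of the expanded
    multiset is returned."""
    expanded = []
    for x, w in zip(samples, weights):
        expanded.extend([x] * w)
    expanded.sort()
    return expanded[len(expanded) // 2]

def wm_filter(signal, weights):
    """Sliding-window weighted median filtering of a 1-D signal,
    with edge samples replicated for padding."""
    k = len(weights) // 2          # half-window size
    padded = [signal[0]] * k + list(signal) + [signal[-1]] * k
    return [weighted_median(padded[i:i + len(weights)], weights)
            for i in range(len(signal))]
```

With unit weights this reduces to the ordinary median filter, which removes isolated impulses while preserving edges.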
LHQ (long history quantization), when applied in conjunction with scalar quantization for the coding of LSP (line spectrum pair) coefficients in a CELP (code excited linear prediction) system, offers transparent quantization at an average bit rate of 25 bits/frame. Certain improvements to the LHQ-scalar quantization scheme are presented which further reduce the average LSP bit rate to about 22 bits/frame...
The authors propose an efficient vector quantization scheme and a novel linear predictive coding (LPC) analysis scheme, both of which exploit interframe correlation in the successive spectrum envelopes of speech signals. The first quantization scheme proposed is a multistage vector quantization of line spectrum pair (LSP) parameters with a partially adaptive codebook (MSVQ-AC). The second algorithm...
Immittance spectral pairs (ISPs) form a new set of parameters for representing the linear predictive coding (LPC) filter. For a filter of order n, the ISP representation consists of a gain and n-1 frequency parameters, instead of the n frequency parameters used by line spectrum pairs (LSPs). Regarding LPC as a pseudo-model of the vocal tract, ISPs can represent the immittance at the glottis without imposing,...
A mapping of a binary block code is used to generate the reconstruction vectors of a vector quantizer. The aim is to secure channel robustness while allowing for efficient design, storage, and handling of the vector quantizer. The general procedure is exemplified by the task of spectral coding for speech transmission. Using an LSP (line spectrum pair)-representation for the spectrum, it is demonstrated...
Trellis-coded vector quantization (TCVQ) is used to encode line spectrum pair (LSP) parameters. Three intraframe encoding schemes are considered: direct encoding of the LSP vectors, encoding the differences of LSP parameters, and using nonlinear prediction in encoding the LSP vectors. The last encoding scheme is the best; it can achieve 1-dB spectral distortion using about 27 bits for each speech...
A combined quantization-interpolation of speech line spectrum pair (LSP) parameters is proposed. To utilize the linear dependency between successive LSP frames, the proposed algorithm attempts to locate the frames where there is a significant spectral change; these frames are encoded by vector quantization. The remaining frames are reconstructed by linear interpolation between the vector-quantized...
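A minimal sketch of the frame-selection and interpolation steps, assuming Euclidean distance as the spectral-change measure and omitting the vector quantizer itself (all names and the threshold are illustrative, not taken from the paper):

```python
def select_anchor_frames(frames, threshold):
    """Mark frames whose spectral distance from the previous anchor
    exceeds `threshold`; these would be vector quantized, while the
    frames in between are reconstructed by interpolation."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    anchors = [0]
    for i in range(1, len(frames)):
        if dist(frames[i], frames[anchors[-1]]) > threshold:
            anchors.append(i)
    if anchors[-1] != len(frames) - 1:
        anchors.append(len(frames) - 1)   # always transmit the last frame
    return anchors

def interpolate(frames, anchors):
    """Rebuild non-anchor frames by linear interpolation between the
    surrounding anchor frames."""
    out = [list(f) for f in frames]
    for a, b in zip(anchors, anchors[1:]):
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            out[i] = [(1 - t) * xa + t * xb
                      for xa, xb in zip(frames[a], frames[b])]
    return out
```

Only the anchor indices and their quantized vectors need to be transmitted; the decoder regenerates the remaining frames by the same interpolation.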
In many vocoders LSFs (line spectrum frequencies) are used to encode the linear predictive coding (LPC) parameters. An interframe differential coding scheme is presented for LSFs. The LSFs of the current speech frame are predicted by using both the LSFs of the previous frame and some of the LSFs of the current frame. Then the difference vector resulting from prediction is vector quantized. The proposed...
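The prediction step might be sketched as follows, using the preceding current-frame LSF as the intraframe component and fixed illustrative weights (the paper's actual predictor coefficients are not given here):

```python
def predict_lsfs(prev_lsf, cur_lsf, alpha=0.5, beta=0.5):
    """Predict each current-frame LSF from the same LSF of the
    previous frame and the preceding (already available) LSF of the
    current frame; alpha and beta are illustrative weights."""
    pred = [prev_lsf[0]]
    for i in range(1, len(prev_lsf)):
        pred.append(alpha * prev_lsf[i] + beta * cur_lsf[i - 1])
    return pred

def residual(cur_lsf, pred):
    """Difference vector that would be handed to the vector quantizer."""
    return [c - p for c, p in zip(cur_lsf, pred)]
```

Because successive LSF frames are highly correlated, the residual has much smaller variance than the raw LSFs, which is what makes it cheaper to quantize.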
Techniques for vector quantization of the line spectrum frequency (LSF) representation of speech spectra are described. Applications for spectral vector quantization include very low rate (300 to 1200 bits/s) speech coders, the half rate digital cellular speech coders, and many other linear prediction-based speech coders. Some of the techniques and tradeoffs for the quantization of LSF speech spectra...
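The codebook search common to these schemes can be sketched as below; practical LSF quantizers usually apply a spectrally weighted distance, which is omitted here for brevity:

```python
def nearest_codeword(codebook, vector):
    """Full-search vector quantization: return the index of the
    codebook entry with minimum (unweighted) squared error to the
    input LSF vector."""
    def sq_err(c):
        return sum((x - y) ** 2 for x, y in zip(c, vector))
    return min(range(len(codebook)), key=lambda k: sq_err(codebook[k]))
```

Only the winning index is transmitted; the bit rate is log2 of the codebook size per vector, which is the tradeoff the split and multistage structures in these papers are designed to manage.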
The authors consider the estimation of powerful statistical language models using a technique that scales from very small to very large amounts of domain-dependent data. They begin with improved modeling of the grammar statistics, based on a combination of the backing-off technique and zero-frequency techniques. These are extended to be more amenable to the particular system considered here. The resulting...
Linguistic structure in the form of a partial-coverage phrase structure grammar is combined with statistical N-gram techniques. The result is a robust statistical grammar which explicitly incorporates linguistic and semantic structure. This approach makes it possible to model carefully those parts of the input that are important for an application and to use robust techniques that provide a full-coverage...
A bigram class model, which gives the probability of a word class given its predecessor class, has been developed. Simulated annealing is used to automatically classify the words of large text corpora. A first validation of the use of simulated annealing in language modeling is presented. Results are presented for a French corpus of 40,000 words and a German corpus of 100,000 words. It is demonstrated...
Ongoing efforts at adaptive statistical language modeling are described. To extract information from the document history, trigger pairs are used as the basic information-bearing elements. To combine statistical evidence from multiple triggers, the principle of maximum entropy (ME) is used. To combine the trigger-based model with the static model, the latter is absorbed into the ME formalism. Given...
A novel recursive transition network speech decoder designed for robust processing of spontaneous spoken input is described. Two levels of stochastic language models are used in the recognition search as well as the rule-based network constraints. The authors describe the basic decoder and system architecture and evaluate the system against a loosely coupled system on spontaneous spoken dialogues...
Prosodic patterns provide important cues for resolving syntactic ambiguity, and can be used to improve the accuracy of automatic speech understanding. With this goal, the authors propose a method of scoring syntactic parses in terms of observed prosodic cues, which can be used in ranking sentence hypotheses and associated parses. Specifically, the score is the probability of a hypothesized word sequence...
A linguistic analyzer based on KCTs (keyword classification trees) was trained on sentences from the ATIS (Air Travel Information System) air travel task and incorporated into the system (CHANEL) built at CRIM (Centre de Recherche Informatique de Montreal) for the Nov. 1992 ATIS benchmarks. Word sequences were processed by a local parser that identified semantically important noun phrases and then...
In evaluating speech recognition, an alignment of reference symbols with hypothesized symbols is the basis of other measures. The authors report on advances made at the National Institute of Standards and Technology on algorithms for alignment. They empirically justify phonological alignment, which minimizes differences in phonological features. A novel technique for identifying splits and merges...
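The underlying alignment is a standard dynamic program. The sketch below uses illustrative substitution/insertion/deletion costs and returns only the total cost; phonological alignment as described here would replace the flat substitution cost with a phonological feature distance:

```python
def align_cost(ref, hyp, sub_cost=4, ins_cost=3, del_cost=3):
    """Dynamic-programming alignment of reference and hypothesis
    symbol sequences; returns the minimum total edit cost.
    The cost values here are illustrative defaults."""
    m, n = len(ref), len(hyp)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * del_cost
    for j in range(1, n + 1):
        d[0][j] = j * ins_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            match = d[i - 1][j - 1] + (0 if ref[i - 1] == hyp[j - 1]
                                       else sub_cost)
            d[i][j] = min(match,
                          d[i - 1][j] + del_cost,   # deletion
                          d[i][j - 1] + ins_cost)   # insertion
    return d[m][n]
```

Backtracking through the same table recovers the alignment itself, from which substitution, insertion, and deletion counts (and hence word error rate) are tallied.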
A constraint-based parser capable of processing a word graph containing multiple sentence hypotheses has been developed. When syntactic constraints are applied to a word graph, this parser is able to prune the graph of many ungrammatical sentence hypotheses and limit the possible parses of the remaining sentences. However, in many cases syntactic information alone is insufficient for selecting a single...
A heuristic-based parsing algorithm for general stochastic context-free grammars is presented. The algorithm is basically a top-down parser that combines an extension of the A* search paradigm with a constraint satisfaction procedure. Heuristics are used to increase the likelihood of making a correct choice while constraint satisfaction eliminates states that cannot lead to a solution. Two simple...